Stark County
Geological Inference from Textual Data using Word Embeddings
Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil
This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensionality reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we use the haversine formula to calculate the distance between the ten cities most semantically related to the target keyword and identified mine locations. The results demonstrate that combining NLP with dimensionality reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the predicted cities fall in the same region as the identified mine locations, the accuracy has room for improvement.
- Europe > United Kingdom (0.05)
- Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
- North America > Canada > British Columbia (0.04)
- (32 more...)
- Energy (0.94)
- Materials > Metals & Mining > Lithium (0.50)
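The ranking and benchmarking steps described in the geological inference abstract above (cosine similarity between place-name embeddings and a target keyword, then haversine distance to mine locations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy two-dimensional vectors, and the place names `city_a`, `city_b`, `city_c` are all hypothetical, and a real pipeline would load trained GloVe vectors instead.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_places(target_vec, place_vecs, top_k=10):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(target_vec, vec))
              for name, vec in place_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees.

    radius_km is a mean Earth radius; the value is an assumption here.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical toy embeddings standing in for trained GloVe vectors.
toy_vectors = {
    "city_a": [1.0, 0.0],
    "city_b": [0.6, 0.8],
    "city_c": [0.0, 1.0],
}
target = [1.0, 0.1]  # hypothetical embedding of the target keyword
ranking = rank_places(target, toy_vectors, top_k=2)
```

Each ranked city would then be geocoded and passed to `haversine_km` together with the coordinates of a known mine to obtain the benchmark distance.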
Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice
Shah, Karan, Butler, Julie, Knaub, Alexis, Zenginoğlu, Anıl, Ratcliff, William, Soltanieh-ha, Mohammad
It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- (5 more...)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
AmbigDocs: Reasoning across Documents on Different Entities under the Same Name
Lee, Yoonsang, Ye, Xi, Choi, Eunsol
Different entities with the same name can be difficult to distinguish. Handling confusing entity mentions is a crucial skill for language models (LMs). For example, given the question "Where was Michael Jordan educated?" and a set of documents discussing different people named Michael Jordan, can LMs distinguish entity mentions to generate a cohesive answer to the question? To test this ability, we introduce a new benchmark, AmbigDocs. By leveraging Wikipedia's disambiguation pages, we identify a set of documents, belonging to different entities who share an ambiguous name. From these documents, we generate questions containing an ambiguous name and their corresponding sets of answers. Our analysis reveals that current state-of-the-art models often yield ambiguous answers or incorrectly merge information belonging to different entities. We establish an ontology categorizing four types of incomplete answers and automatic evaluation metrics to identify such categories. We lay the foundation for future work on reasoning across multiple documents with ambiguous entities.
- Europe > France (0.04)
- North America > United States > Rhode Island > Newport County > Newport (0.04)
- North America > United States > New Jersey (0.04)
- (9 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
Neural Approaches to Entity-Centric Information Extraction
Artificial Intelligence (AI) has a huge impact on our daily lives, with applications such as voice assistants, facial recognition, chatbots, and self-driving cars. Natural Language Processing (NLP) is a cross-discipline of AI and Linguistics dedicated to the study of text understanding. This is a very challenging area due to the unstructured nature of language, with its many ambiguities and corner cases. In this thesis we address a very specific area of NLP that involves the understanding of entities (e.g., names of people, organizations, locations) in text. First, we introduce a radically different, entity-centric view of the information in text. We argue that instead of using individual mentions in text to understand their meaning, we should build applications that work in terms of entity concepts. Next, we present a more detailed model of how the entity-centric approach can be used for the entity linking task. In our work, we show that this task can be improved by performing entity linking at the coreference cluster level rather than for each mention individually. In our next work, we further study how information from Knowledge Base entities can be integrated into text. Finally, we analyze the evolution of entities from a temporal perspective.
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- (25 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.92)
- Law (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- (10 more...)
Machine Learning Technique Predicting Video Streaming Views to Reduce Cost of Cloud Services
Video streams occupy the largest share of online traffic. Multiple versions of a video are created to fit the user's device specifications. In cloud storage, keeping all versions of frequently accessed video streams in the repository for the long term imposes a significant cost on video streaming providers. Generally, the popularity of a video changes from one period to the next, which means the number of views a video receives may drop, at which point the video should be deleted from the repository. Therefore, in this paper, we develop a method that predicts the popularity of each video stream in the repository in the next period. In addition, we propose an algorithm that uses the predicted popularity of a video to compute the storage cost and then decides whether the video will be kept in or deleted from the cloud repository. The experimental results show a 15% reduction in the cost of cloud services compared to keeping all video streams.
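The keep-or-delete decision described in the video streaming abstract above can be illustrated with a small sketch. The abstract does not state the actual decision rule, so everything here is an assumption made for illustration: the `VideoVersion` class, the price parameters, and the rule itself (keep a version only if regenerating it on demand for the predicted views would cost at least as much as storing it for the period).

```python
from dataclasses import dataclass


@dataclass
class VideoVersion:
    video_id: str
    size_gb: float
    predicted_views: int  # hypothetical output of a popularity predictor


def should_keep(version, storage_price_per_gb, regen_cost_per_view):
    """Assumed rule: keep when on-demand regeneration would cost at
    least as much as storing the version for the next period."""
    storage_cost = version.size_gb * storage_price_per_gb
    regeneration_cost = version.predicted_views * regen_cost_per_view
    return regeneration_cost >= storage_cost


# Hypothetical prices and predictions, chosen only to show both outcomes.
popular = VideoVersion("v1", size_gb=2.0, predicted_views=100)
unpopular = VideoVersion("v2", size_gb=2.0, predicted_views=10)
keep_popular = should_keep(popular, 0.02, 0.001)
keep_unpopular = should_keep(unpopular, 0.02, 0.001)
```

With these made-up numbers the popular version is kept and the unpopular one is deleted; a real system would plug in the provider's actual pricing and the output of the trained prediction model.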
Tesla Autopilot head Andrej Karpathy leaves as company faces renewed crash probes
Tesla Director of Artificial Intelligence and Autopilot Andrej Karpathy is leaving the company at a critical time, as it faces renewed probes over crashes and growing scrutiny. 'It's been a great pleasure to help Tesla towards its goals over the last 5 years and a difficult decision to part ways. In that time, Autopilot graduated from lane keeping to city streets and I look forward to seeing the exceptionally strong Autopilot team continue that momentum,' he wrote on Twitter, noting that he has no plans for what's next. Tesla CEO Elon Musk replied to thank him for his work at the company. The leadership change comes at a challenging time, as Tesla faces renewed scrutiny from US regulators over crashes involving drivers who used Autopilot and works to expand the latest version of Full Self Driving (FSD) to a larger number of customers.
- North America > United States > California > Los Angeles County > Los Angeles (0.15)
- North America > United States > Texas > Harris County > Houston (0.15)
- North America > United States > Ohio > Stark County > Canton (0.05)
- (4 more...)
- Transportation > Ground > Road (1.00)
- Transportation > Electric Vehicle (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
- (2 more...)
Table-based Fact Verification with Salience-aware Learning
Wang, Fei, Sun, Kexuan, Pujara, Jay, Szekely, Pedro, Chen, Muhao
Tables provide valuable knowledge that can be used to verify textual statements. While a number of works have considered table-based fact verification, direct alignments of tabular data with tokens in textual statements are rarely available. Moreover, training a generalized fact verification model requires abundant labeled training data. In this paper, we propose a novel system to address these problems. Inspired by counterfactual causality, our system identifies token-level salience in the statement with probing-based salience estimation. Salience estimation allows enhanced learning of fact verification from two perspectives. From one perspective, our system conducts masked salient token prediction to enhance the model for alignment and reasoning between the table and the statement. From the other perspective, our system applies salience-aware data augmentation to generate a more diverse set of training instances by replacing non-salient terms. Experimental results on TabFact show that the proposed salience-aware learning techniques are effective, leading to new SOTA performance on the benchmark. Our code is publicly available at https://github.com/luka-group/Salience-aware-Learning.
- North America > United States > California (0.14)
- North America > United States > New York > Broome County > Binghamton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
Determining Sentencing Recommendations and Patentability Using a Machine Learning Trained Expert System
Brown, Logan, Pezewski, Reid, Straub, Jeremy
This paper presents two studies that use a machine learning expert system (MLES). One focuses on a system to advise United States federal judges on consistent federal criminal sentencing, based on both the federal sentencing guidelines and offender characteristics. The other study aims to develop a system that could prospectively assist the U.S. Patent and Trademark Office in automating its patentability assessment process. Both studies use a machine learning-trained rule-fact expert system network to accept input variables for training and presentation and output a scaled variable that represents the system recommendation (e.g., the sentence length or the patentability assessment). This paper presents and compares the rule-fact networks that have been developed for these projects. It explains the decision-making process underlying the structures used for both networks and the data pre-processing that was needed and performed. By comparing the two systems, it also discusses how different methods can be used with the MLES.
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- North America > United States > North Dakota > Cass County > Fargo (0.04)
- (5 more...)
Leaked emails from Tesla say its 'Full Self-Driving' beta will 'remain largely unchanged'
Elon Musk has been banging the drum for Tesla's 'Full Self-Driving' (FSD) for more than five years, but a number of leaked emails reveal the technology is far from providing hands-free capabilities. Documents exchanged between Tesla attorneys and the California Department of Motor Vehicles (DMV) say vehicles using the firm's latest beta version, known as 'Autosteer on City Streets', will not surpass Level 2 autonomy. This level of autonomy requires drivers to remain aware and control the brake, accelerator and steering, despite Musk promising 'full self driving' by 2021. Attorneys for the carmaker said the FSD beta upgrade 'does not make it autonomous under the DMV's definition,' and stated that the Level 2 capability will 'remain largely unchanged' in a full customer rollout. 'City Streets continues to firmly root the vehicle in SAE Level 2 capability and does not make it autonomous under the DMV's definition,' wrote Eric Williams, Tesla associate general counsel, in a statement attached to an email with the California DMV that has been published to PlainSite.
- North America > United States > Ohio > Stark County > Canton (0.05)
- North America > United States > California > Santa Clara County > Mountain View (0.05)
- North America > United States > California > Alameda County > Berkeley (0.05)
- Asia > China > Beijing > Beijing (0.05)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
- Government > Regional Government > North America Government > United States Government (0.31)
Gaming on a Budget? Try Your Local Library
In the immortal words of Arthur the Aardvark, "Havin' fun isn't hard, when you've got a library card!" But how much fun can you really have with a library card? Turns out, more than I expected. Libraries across America are adding video games to their collections available for checkout. Gamers with an incessant appetite for new experiences or anyone looking to play video games for free should contact their local library to see if they have a collection.
- North America > United States > Ohio > Stark County > Canton (0.05)
- North America > United States > Kansas (0.05)